Cable news and COVID-19 vaccine uptake

COVID-19 vaccines have reduced infections and hospitalizations across the globe, yet resistance to vaccination remains strong. This paper investigates the role of cable television news in vaccine hesitancy and associated local vaccination rates in the United States. We find that, in the earlier stages of the vaccine roll-out (starting May 2021), higher local viewership of Fox News Channel has been associated with lower local vaccination rates. We can verify that this association is causal using exogenous geographical variation in the channel lineup. The effect is driven by younger individuals (under 65 years of age), for whom COVID-19 has a low mortality risk. Consistent with changes in beliefs about the effectiveness of the vaccine as a mechanism, we find that Fox News increased reported vaccine hesitancy in local survey responses. We can rule out that the effect is due to differences in partisanship, to local health policies, or to local COVID-19 infections or death rates. The other two major television networks, CNN and MSNBC, have no effect. That, in turn, indicates that more differentiated characteristics, like the networks’ messaging or tendency for controversy, matter and that the effect of Fox News on COVID-19 vaccine uptake is not due to the general consumption of cable news. We also show that there is no historical effect of Fox News on flu vaccination rates, suggesting that the effect is COVID-19-specific and not driven by general skepticism toward vaccines.


S1 Further information on data sources
This appendix section provides additional information about our data. Summary statistics are reported in Table S1.

S1.1 Channel positions
Channel positions for FNC, MSNBC and CNN come from the Nielsen FOCUS database, which reports channel lineups of all U.S. local broadcast systems, with information about the area served by the system at the zipcode level. We use channel positions from 2016, the latest year for which we have access. We aggregate the data at the county level, by averaging zipcode-level channel positions with weighting by population size. To address the presence of outlier channels, we winsorize the variables at the top and bottom deciles.
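The aggregation and winsorization steps described above can be sketched as follows. This is a minimal illustration rather than the code used for the paper: the input layout (one tuple per zip code), the function names, and the reading of "top and bottom deciles" as the 10th and 90th percentiles with linear interpolation are all assumptions.

```python
from collections import defaultdict


def county_weighted_position(rows):
    """Population-weighted mean channel position per county.

    rows: iterable of (county, channel_position, population), one per zip code
    (hypothetical layout, not the Nielsen FOCUS schema).
    """
    num = defaultdict(float)
    den = defaultdict(float)
    for county, position, population in rows:
        num[county] += position * population
        den[county] += population
    return {c: num[c] / den[c] for c in num if den[c] > 0}


def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 1]) of a sorted, non-empty list."""
    if not sorted_vals:
        raise ValueError("percentile of empty list")
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def winsorize(values, lower=0.10, upper=0.90):
    """Clip values to the [lower, upper] percentile range (assumed decile cutoffs)."""
    s = sorted(values)
    lo_cut, hi_cut = percentile(s, lower), percentile(s, upper)
    return [min(max(v, lo_cut), hi_cut) for v in values]
```

Winsorizing clips extreme channel positions to the cutoff values instead of dropping the counties, so the sample size is unchanged.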

S1.2 Channel viewership
Television viewership by county for FNC, MSNBC and CNN is provided by Nielsen. The measure is "ratings," which is proportional to the number of minutes that each household tuned in to each specific channel during the months of January and February 2020. Throughout the paper, we standardize viewership for all networks by its standard deviation. Some counties in the raw data were split into parts (e.g. North County-A, East County-A); we aggregated these parts by simple average.
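A minimal sketch of these two steps is given below. The mapping from split units to their parent county, the function names, and the use of the sample standard deviation (rather than the population one) are assumptions made for illustration only.

```python
from statistics import mean, stdev


def merge_split_counties(ratings, parent_of):
    """Average the ratings of county parts into one value per county.

    ratings: {unit_name: rating}; parent_of: {unit_name: county_name} for
    units that are parts of a split county (hypothetical mapping).
    """
    grouped = {}
    for unit, rating in ratings.items():
        grouped.setdefault(parent_of.get(unit, unit), []).append(rating)
    return {county: mean(vals) for county, vals in grouped.items()}


def scale_by_sd(values_by_county):
    """Divide each county value by the cross-county standard deviation."""
    sd = stdev(list(values_by_county.values()))  # sample SD: an assumption
    return {county: v / sd for county, v in values_by_county.items()}
```

After scaling, a one-unit change in viewership corresponds to one standard deviation of the cross-county distribution, which makes coefficients comparable across the three networks.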

S1.3 COVID-19 cases and fatalities
Confirmed COVID-19 cases and fatalities are from The New York Times (https://github.com/nytimes/covid-19-data). The dataset is already at the daily and county level and starts in January 2020. As these variables are cumulative, we generate new daily cases and fatalities by subtracting the observation of day n − 1 from that of day n.
We further calculate weekly estimates by summing daily estimates over calendar weeks. The New York Times states that cumulative cases can sometimes decrease after a state corrects a reporting mistake. When such a correction makes the weekly sum of daily observations negative, we set the weekly observation to missing.
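The conversion from cumulative counts to weekly new counts can be sketched as follows for a single county. The function names are hypothetical, and two choices are assumptions rather than details from the text: the count before the first observation is taken to be zero, and calendar weeks run Monday to Sunday (as for the vaccination data in S1.4).

```python
from datetime import date, timedelta


def daily_new(cumulative):
    """Difference a cumulative series into daily new counts.

    cumulative: list of (date, cumulative_count) for one county, sorted by date.
    The count before the first observation is assumed to be zero.
    """
    out = []
    prev = 0
    for day, total in cumulative:
        out.append((day, total - prev))
        prev = total
    return out


def weekly_new(daily):
    """Sum daily counts by Monday-to-Sunday week; negative sums become None."""
    weeks = {}
    for day, n in daily:
        monday = day - timedelta(days=day.weekday())  # Monday of that week
        weeks[monday] = weeks.get(monday, 0) + n
    # A negative weekly sum signals a reporting correction: set it to missing.
    return {wk: (total if total >= 0 else None) for wk, total in weeks.items()}
```

For example, a county whose cumulative count drops from 12 to 9 between two weeks would receive a missing value for the later week rather than a negative number of new cases.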

S1.4 COVID-19 vaccinations
The COVID-19 vaccine roll-out in the United States began in December 2020, following federal guidelines and CDC recommendations (state governments could adjust the vaccination strategy depending on their demographics, health care system, and COVID-19 situation). First, healthcare workers were vaccinated (starting December 2020), then people aged over 65 (end of January 2021), people with medical conditions and disabilities (mid-March 2021), and people aged above 50 (April 2021). Finally, eligibility was expanded to virtually the whole population by May 2021 (https://www.cdc.gov/coronavirus/2019-ncov/vaccines/recommendations-process.html). Some states had already started to distribute the vaccine to all adults in late March (e.g. Alaska, Georgia, Mississippi, Ohio, and Texas).
The statistics on full COVID-19 vaccinations (final doses) come from the Centers for Disease Control and Prevention (CDC), who aggregate data reported by state health agencies, jurisdictions and federal entities. The CDC data contain county-level numbers for most U.S. states and territories. Statistics for Texas and Hawaii are only reported at the state level; we thus do not include those states in our analysis. Also, California does not report data on counties with fewer than 20,000 inhabitants, so we cannot include those counties in our analysis. Taken together, the vaccination data include roughly 2950 counties, of which around 2750 are used for the main analysis. This difference is due to a lack of data on viewership, for example for the states/territories of Alaska, Guam, Puerto Rico, and the Virgin Islands.
Data access is possible via CDC's dedicated API. We collect information for the first of January 2021 and onwards. Similarly to the COVID-19 case and fatality numbers, vaccination data is provided in a cumulative manner. Generating Monday-to-Sunday weekly vaccinations, we end up with weekly data starting January 11. The latest data (until end of 2021) have been downloaded with the same methodology in April 2022.
Vaccination estimates refer to full vaccinations, which generally comprise two doses spaced by 4 to 5 weeks unless only one dose is required (e.g. J&J/Janssen vaccine). The CDC neither account for the timing between the two doses, nor for the delay after which the vaccine becomes effective. Note that counties refer to the individual's county of residence, not the county where they got vaccinated. Data is available for the total population, for adults aged between 18 and 64 years old, and for adults aged 65 or older.

S1.5 COVID-19 vaccination hesitancy
Vaccination hesitancy data originate from the Census Bureau's Household Pulse Survey (HPS) conducted in the first half of March 2021. In the survey, respondents were asked to answer "Once a vaccine to prevent COVID-19 is available to you, would you ... get a vaccine?" with: (1) "definitely get a vaccine", (2) "probably get a vaccine", (3) "probably not get a vaccine", (4) "definitely not get a vaccine." The CDC use the HPS to estimate hesitancy rates at the state level, followed by an estimation at the Public Use Microdata Areas (PUMA) level using the Census Bureau's 2019 American Community Survey 1-year Public Use Microdata Sample. Then, county estimates are generated using the Missouri Census Data Center PUMA-to-county crosswalk. For PUMAs overlapping with multiple counties, data is averaged using the 2010 Census populations. We retrieved this data from the CDC's website (https://data.cdc.gov).

S1.6 Health care data
Data on ICU units and the number of hospitals for 2438 counties are provided by Kaiser Health News 1 (KHN). Bed counts stem from the hospitals' financial cost reports, filed annually to the Centers for Medicare & Medicaid Services. For more information on their methodology and data sources, see https://khn.org/MTA2ODgzMw.

S1.7 Data on the counties' ability to adequately react to COVID-19
The CDC report a variable measuring the counties' ability to handle a COVID-19 outbreak, ranging from 0 to 1 with 1 being the most vulnerable. Data originates from the Surgo COVID-19 Vaccine Coverage Index (CVAC), which is based on measures for access to health care, affordable housing, transportation, childcare, and safe and secure employment to predict how well counties can handle a COVID-19 outbreak.

S1.8 Seasonal flu vaccination
To our knowledge, there is no county-level dataset on seasonal flu vaccination that covers all individuals in the United States. We use flu vaccination rates among Medicare fee-for-service enrollees, provided by the County Health Rankings & Roadmaps program from the University of Wisconsin. They take data from the CMS Office of Minority Health's Mapping Medicare Disparities (MMD) data tool, which reports various health outcomes at the county level. The influenza vaccination prevalence rates are calculated by searching for the respective diagnosis code in Medicare beneficiaries' claims and dividing the resulting count by the number of Medicare beneficiaries in each county. Other recent papers also use this data source in the COVID-19 context (e.g. 2).
We take county-level data for all 50 states from the 2019, 2020 and 2021 reports, each containing estimates for the year three years before the report date. Therefore, our analysis can integrate the flu vaccination rates of fee-for-service enrollees for the years 2016, 2017 and 2018. We acknowledge that this data covers only the sub-sample of the U.S. population with access to Medicare. In 2019, 58 million Americans were covered by Medicare, which provides health coverage benefits for most adults aged 65 and older (Census Bureau, Health Insurance Coverage in the United States: 2019, issued September 2020).

S1.10 Demographics and politics
Other local socio-economic variables come from the 2010 U.S. Census. These variables are all at the county level and include, following the specification in 3: population, population density, land area, working-age share of the population (ages 20-69 relative to other ages), proportion eligible for food stamps, proportion who never attended high school, proportion who attended college, a dummy for an above-median Black population share, a dummy for an above-median white population share, and the proportion of males.
We also use data on the Republican vote share of the 2012 and 2016 presidential elections. In the robustness checks, we expand this set of controls with the following variables from the 2010 U.S. Census: proportion who attended high school, a dummy for above-median Hispanic population share, a dummy for above-median Asian population share, proportion belonging to middle income categories (20-25k, 25-30k, 30-35k, 35-40k, 40-45k, 45-50k), and proportion working in occupation categories (management and professional, services, sales and office, construction, extraction and maintenance, production, transportation and material moving). In the supplementary robustness checks, we complement the Census variables with further occupation shares from the 2019 American Community Survey (ACS), which is also provided by the Census Bureau. The ACS occupation shares comprise the medical, retail, agriculture, industrial, and transport sectors.

S1.11 Further Data
Other data sources are used in the supporting analysis: the Gallup Polling Social Series (political party self-identification and ideology for the years 2016-2020) and television viewership data from the American Time Use Survey for the period 2015-2019 (338 counties).

S2.1 Weekly evolution of COVID-19 in the U.S.
In Figure S1, we report weekly full vaccinations per age group in our sample of counties, and the average percentage of fully vaccinated individuals by age group. Vaccination efforts began in mid-December 2020 for the elderly and healthcare workers. We observe a sharp increase in full vaccinations for the week of April 12th, one month after several states began to open vaccination to all adults. Figure S2 reports the number of COVID-19 confirmed cases and fatalities for the same set of counties.

S3 Instrument validity
We check here the two assumptions that are made for every IV analysis: relevance and exogeneity.

S3.1 First stage
First, we check for relevance, i.e. that the instruments (the networks' channel positions) are correlated with the networks' viewership. Figure S3 shows the baseline negative correlation between the channel position instrument and the relative network viewership, meaning that a lower channel position in the lineup is associated with higher viewership.
[Figure S3. Axis label: Network Viewership (Standardized).]
We attempt to address the problem of a weak first stage, as faced in previous literature 3,4 for the networks CNN and MSNBC, by accounting for the relative lineup position of the networks. For that purpose, we take the channel position of the network with the highest viewership, FNC, to calculate the other networks' relative positions. While this alternative instrumentation slightly improves the prediction of the viewership measures, we still observe a weak first-stage F-statistic of 4.45 for MSNBC and 2.53 for CNN, and warn readers of this caveat in our analysis in the main text. Consistent with other studies, we obtain a strong first-stage F-statistic of 13.85 for the Fox News Channel. We furthermore show in Figure S6 that our results are virtually identical for FNC and CNN when using the standard instrumentation approach (directly instrumenting each network's viewership with its own channel position) but become noisy for MSNBC, with a first-stage F-statistic of 0.2.
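To make the relevance check concrete, the sketch below computes a first-stage slope and F-statistic for a single instrument. It is a deliberately simplified illustration: it omits the control variables, the other networks' positions, and any adjustment of standard errors that the actual specification may use, so it would not reproduce the F-statistics reported above. The function name and inputs are hypothetical.

```python
def first_stage_f(z, x):
    """First-stage OLS of viewership x on channel position z (with intercept).

    Returns (slope, F). With a single instrument and homoskedastic errors,
    the F-statistic equals the squared t-statistic of the slope.
    """
    n = len(z)
    if n < 3 or n != len(x):
        raise ValueError("need at least 3 paired observations")
    z_bar = sum(z) / n
    x_bar = sum(x) / n
    szz = sum((zi - z_bar) ** 2 for zi in z)
    szx = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x))
    slope = szx / szz
    intercept = x_bar - slope * z_bar
    rss = sum((xi - intercept - slope * zi) ** 2 for zi, xi in zip(z, x))
    sigma2 = rss / (n - 2)           # residual variance
    var_slope = sigma2 / szz         # variance of the slope estimate
    return slope, slope ** 2 / var_slope
```

A common rule of thumb treats a first-stage F-statistic below 10 as a sign of a weak instrument, which is the sense in which the MSNBC and CNN first stages are described as weak.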

Table S4. Balance checks on channel position instruments. The table reports a correlation check between our instruments and local characteristics. We run reduced-form regressions with the specified characteristic as the outcome and the other controls on the right-hand side. Coefficients are standardized by the standard deviation. * p < 0.05, ** p < 0.01.
We investigate these unbalanced characteristics by checking whether they are systematically correlated with the outcome of interest. We include these characteristics as controls in our main specification and furthermore verify that our results are robust to adding them as polynomials (Figures S4).

S3.3 Alternative IV approach
In the main results, the CNN and MSNBC viewership measures are instrumented with the relative channel position of the other two channels, respectively. Figure S6 presents the main results using the same IV for all three networks.
Figure S6. Main results using the same IV for all three networks, total population, 2021 (2SLS). Significance levels: * p < 0.10, ** p < 0.05, *** p < 0.01.
Figure S7. Divergent narratives on vaccines by network in May-June 2021. Each panel presents the smoothed frequencies of each phrase in the major news channels. The figure was produced using the GDELT Television Comparer API (accessible at https://api.gdeltproject.org/api/v2/summary/summary). The GDELT API is a flexible news search interface in which one can specify networks and days to obtain statistics on the relative frequencies of user-provided search queries.
Figure S8 reports the OLS estimates of the main results by age categories for all three channels.
Figure S8. Effect of networks' viewership by age category, 2021 (OLS).

S7.2 Results with extended period of analysis
In Figure S9, we show the results of the main model for all weeks of 2021 (starting with the week of January 11th, 2021). Since the geographical coverage of COVID-19 full vaccinations is smaller for early weeks, the county composition of the regressions for weeks preceding our main results differs, which affects instrument relevance.
Figure S10 shows the main results controlling for the Republican share at the 1992 and 1996 presidential elections. In Figure S11, we interact each presidential election variable with the respective channel position instrument.
Figure S12 shows the main results controlling for the self-reported Republican and conservative affiliation from Gallup. We also interact those two estimates (2012-2019 and 2016-2019) with the respective channel position instrument in Figure S13.

S7.4 Television viewership checks
In Table S7, we use data on time spent watching TV (American Time Use Survey) as an outcome in our analysis. Results show that the individual viewership of each network has no impact on the time spent watching TV. In Figure S14, we also show that our main results are robust to the inclusion of this variable as a control, considering the reduced sample size.
Table S7. Daily television viewership as alternative outcome (2SLS). Standard errors in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.01.
Figure S15 shows our results controlling for COVID-19 cases and fatalities. In Panel (a), we control for cumulative confirmed cases and fatalities for the same week as the outcome. In Panel (b), we control for weekly new confirmed cases and fatalities lagged by four weeks compared to the outcome. We observe that results largely remain unchanged under these specifications. The same is true when interacting these measures with the instruments (cf. Figure S16).
Table S8 reports the impact of viewership on vaccine hesitancy and the ability to handle a COVID-19 outbreak. When using those variables as outcomes in our main analysis, we observe a positive effect at the 10% level on vaccine hesitancy for FNC. When we instead include these variables as controls, we still see a negative and statistically significant effect on COVID-19 vaccinations for FNC after the week of May 10th (cf. Figure S17).

S7.5.2 Vaccine hesitancy and outbreak risk
[Table: Alternative COVID-19 Outcomes. Columns: % Hesitant, Outbreak concern. Row: FNC Viewership, 0.0369.]
Figure S17. Main results controlling for COVID-19 vaccine hesitancy and CVAC level of concern, total population, 2021 (2SLS).
Figure S18. Main results controlling for ICU beds and hospitals, total population, 2021 (2SLS).
Table S10 reports the effect of the networks' viewership on the counties' share of influenza vaccination in 2016, 2017 and 2018. We do not observe a statistically significant effect for any of the three networks. Figure S19 shows that our main results are robust to the inclusion of these variables as controls.

S7.6 Specification checks
In Figure S20, we present the main results when excluding one control variable at a time (leave-one-out test). Therefore, 11 regressions are stacked on the same graph, each leaving out one of the socio-demographic and political preference controls from the main specification.
Figure S20. Leave-one-out test with control variables, total population, 2021 (2SLS). [x-axis: weeks of March 29th to June 21st]
Figure S21 shows the results using different sets of covariates. In Panel (a), we exclude the socio-demographic and political preference controls, keeping only the controls for viewership and channel positions for the other networks. In Panel (b), we include a larger set of covariates: for ethnicities (dummies for above-median Hispanic and Asian population shares), education (proportion who attended high school), the share of the population living in zip codes with access to the network, and a set of controls for the share of the population employed in different sectors: management and professional, services, sales and office, construction, production, arts/design/entertainment. In Figure S22, we replace the Census Bureau occupation variables from Figure S21 Panel (b) with more recent occupational data from the American Community Survey (ACS), also provided by the Census Bureau. The survey includes shares for the following sectors: medical, retail, agriculture, industrial and transport occupations.

S7.7 Sample checks
In Figure S23, we present the main result following a perturbation test that excludes one of the 47 states at a time. Therefore, 47 regressions are stacked on the same graph; the dots represent estimates and the colored area represents the overlaid 95% confidence intervals.
Figure S23. Perturbation test by leaving one state out, FNC channel, total population, 2021 (2SLS). [x-axis: weeks of March 29th to June 21st]
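The logic of the perturbation test can be sketched as follows. This is a simplified illustration under stated assumptions: it uses a just-identified IV estimator with one instrument and no controls (the ratio of covariances), whereas the paper's specification is a 2SLS model with covariates, and the row layout and function names are hypothetical.

```python
def iv_estimate(z, x, y):
    """Just-identified IV estimate of the effect of x on y using instrument z.

    With one instrument, one endogenous regressor and an intercept, 2SLS
    reduces to cov(z, y) / cov(z, x).
    """
    n = len(z)
    z_bar, x_bar, y_bar = sum(z) / n, sum(x) / n, sum(y) / n
    czy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    czx = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x))
    return czy / czx


def leave_one_state_out(rows):
    """Re-estimate the IV coefficient once per state, excluding that state.

    rows: list of (state, channel_position, viewership, outcome), one per county.
    Returns {excluded_state: estimate}.
    """
    estimates = {}
    for state in sorted({r[0] for r in rows}):
        kept = [r for r in rows if r[0] != state]
        z = [r[1] for r in kept]
        x = [r[2] for r in kept]
        y = [r[3] for r in kept]
        estimates[state] = iv_estimate(z, x, y)
    return estimates
```

If the estimates remain close to one another across all excluded states, no single state drives the result.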